Clustering with density based initialization and Bhattacharyya based merging
Abstract
Centroid based clustering approaches, such as k-means, are relatively fast but inaccurate for arbitrarily shaped clusters. Fuzzy c-means with Mahalanobis distance can accurately identify clusters if the data set can be modelled by a mixture of Gaussian distributions. However, these methods require the number of clusters a priori, and a bad initialization can cause poor results. Density based methods, such as DBSCAN, overcome these disadvantages, but they may perform poorly when the dataset is imbalanced. This paper proposes a method, named clustering with density based initialization and Bhattacharyya based merging, built on fuzzy clustering. The initialization is carried out by density estimation with adaptive bandwidth, using the k-Nearest Orthant-Neighbor algorithm, to avoid the effects of imbalanced clusters. The local peaks of the constructed point clouds are used as initial cluster centers. We use the Bhattacharyya measure and Jensen's inequality to find overlapped Gaussians and merge them to form a single cluster. Experiments on a variety of datasets show that the proposed method has remarkable advantages, especially for arbitrarily shaped data sets.
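To make the merging idea concrete, the following is a minimal sketch of the standard closed-form Bhattacharyya distance between two multivariate Gaussians, with a simple threshold test for overlap. It is not the paper's exact criterion: the abstract does not state how Jensen's inequality enters the test, and the function names and the `threshold` value are assumptions introduced here for illustration only.

```python
import numpy as np


def bhattacharyya_distance(mu1, cov1, mu2, cov2):
    """Standard closed-form Bhattacharyya distance between two Gaussians."""
    mu1 = np.atleast_1d(np.asarray(mu1, dtype=float))
    mu2 = np.atleast_1d(np.asarray(mu2, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    cov2 = np.atleast_2d(np.asarray(cov2, dtype=float))

    # Average covariance of the two components.
    cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2

    # Mean term: (1/8) * diff^T * cov^{-1} * diff
    mean_term = 0.125 * diff @ np.linalg.solve(cov, diff)

    # Covariance term: (1/2) * ln( det(cov) / sqrt(det(cov1) * det(cov2)) ),
    # computed with log-determinants for numerical stability.
    _, logdet = np.linalg.slogdet(cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet - 0.5 * (logdet1 + logdet2))

    return float(mean_term + cov_term)


def should_merge(mu1, cov1, mu2, cov2, threshold=0.5):
    """Hypothetical merge rule: treat two Gaussians as overlapped when their
    distance is below `threshold` (an illustrative value, not from the paper)."""
    return bhattacharyya_distance(mu1, cov1, mu2, cov2) < threshold
```

A small distance means the two Gaussians overlap heavily, so a rule of this kind would merge them into one cluster; a suitable threshold would have to be chosen per application.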
Similar resources
Merging Distance and Density Based Clustering
Clustering is an important data exploration task. Its use in data mining is growing very fast. Traditional clustering algorithms which no longer cater to the data mining requirements are modified increasingly. Clustering algorithms are numerous which can be divided in several categories. Two prominent categories are distance-based and density-based (e.g. K-means and DBSCAN, respectively). While...
Automatic Clustering of Flow Cytometry Data with Density-Based Merging
The ability of flow cytometry to allow fast single cell interrogation of a large number of cells has made this technology ubiquitous and indispensable in the clinical and laboratory setting. A current limit to the potential of this technology is the lack of automated tools for analyzing the resulting data. We describe methodology and software to automatically identify cell populations in flow c...
'1 + 1 > 2': Merging Distance and Density Based Clustering
Clustering is an important data exploration task. Its use in data mining is growing very fast. Traditional clustering algorithms which no longer cater to the data mining requirements are modified increasingly. Clustering algorithms are numerous which can be divided in several categories. Two prominent categories are distance-based and density-based (e.g. K-means and DBSCAN, respectively). While K...
Initialization Free Graph Based Clustering
This paper proposes an original approach to cluster multi-component data sets, including an estimation of the number of clusters. From the construction of a minimal spanning tree with Prim’s algorithm, and the assumption that the vertices are approximately distributed according to a Poisson distribution, the number of clusters is estimated by thresholding the Prim’s trajectory. The correspondin...
Improvement of density-based clustering algorithm using modifying the density definitions and input parameter
Clustering is one of the main tasks in data mining, which means grouping similar samples. In general, there is a wide variety of clustering algorithms. One of these categories is density-based clustering. Various algorithms have been proposed for this method; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...
Journal
Journal title: Turkish Journal of Electrical Engineering and Computer Sciences
Year: 2022
ISSN: 1300-0632, 1303-6203
DOI: https://doi.org/10.55730/1300-0632.3794